Control-flow checking using watchdog assists and extended-precision checksums

Authors

  • Nirmal R. Saxena
  • Edward J. McCluskey
Abstract

troller), it has the advantage of maintaining sequential consistency, thus allowing parallel programs to work as expected.

VII. SUMMARY AND CONCLUSIONS

The delays due to error checking being performed in series with intermodule communication are one of the primary causes of performance degradation associated with implementing concurrent error detection and correction in VLSI systems. This performance penalty may be caused by dedicated checkers in the communication paths between each module and the rest of the system or the need to wait for the hardware to perform redundant operations that verify the validity of the "recent" results. Checks that are area efficient are usually serial and slow, and faster checks are often based on checkers which take up large area, thus slowing down surrounding circuits. Checking delays are compounded when data are transferred between several modules, and checked at several places.

This fundamental problem in achieving fault tolerance in high-performance VLSI systems can be overcome by performing checks on the data in parallel with intermodule communication. Data to be checked can be latched, and error checking can take place in one or more subsequent cycles (as a pipeline). As a consequence, error signals can arrive one or more cycles after error-damaged data are received for processing.

In this paper, we have described a mechanism, called micro rollback, which allows checking to proceed in parallel with communication by supporting fast rollback of a few cycles when a delayed error signal arrives. Micro rollback is a powerful technique that facilitates the implementation of high-performance VLSI systems which are also highly fault tolerant. It allows a variety of concurrent error detection and correction techniques to be used with minimal performance penalty.
With micro rollback, it is feasible to operate systems in hostile environments with a high rate of transient faults, because individual modules can initiate a rollback and retry that completes in a few cycles, without resorting to expensive system-wide rollbacks.

We have presented a systematic way to design VLSI computer modules that can roll back and restore the state which existed when the error occurred. Specifically, the implementation of micro rollback in simple synchronous systems involves replication of small isolated registers and the use of full delayed-write buffers (DWBs) for storing recent state changes to large register files. When applied to a VLSI RISC processor, the micro rollback technique is characterized by extremely low …
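The delayed-write buffer idea described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class name `DelayedWriteRegisterFile`, its methods, and the cycle-tagging scheme are hypothetical choices made for illustration. Writes are held in a buffer for `depth` cycles before being committed, so a late error signal can discard the writes of the last few cycles.

```python
from collections import deque


class DelayedWriteRegisterFile:
    """Toy model (hypothetical, for illustration only) of a register file
    whose writes wait in a delayed-write buffer (DWB) before committing."""

    def __init__(self, num_regs, depth):
        self.regs = [0] * num_regs   # committed state
        self.depth = depth           # maximum rollback distance in cycles
        self.cycle = 0               # current cycle number
        self.dwb = deque()           # (cycle_written, reg, value), oldest first

    def write(self, reg, value):
        # New values never go straight to the register file.
        self.dwb.append((self.cycle, reg, value))

    def read(self, reg):
        # The newest pending write to this register wins over committed state.
        for _, r, v in reversed(self.dwb):
            if r == reg:
                return v
        return self.regs[reg]

    def tick(self):
        # Advance one cycle and commit writes that are now too old to be
        # undone by a rollback of at most `depth` cycles.
        self.cycle += 1
        while self.dwb and self.dwb[0][0] < self.cycle - self.depth:
            _, reg, value = self.dwb.popleft()
            self.regs[reg] = value

    def rollback(self, k):
        # Restore the state that existed k cycles ago by dropping every
        # pending write made at or after that cycle.
        if not 1 <= k <= min(self.depth, self.cycle):
            raise ValueError("rollback distance out of range")
        target = self.cycle - k
        while self.dwb and self.dwb[-1][0] >= target:
            self.dwb.pop()
        self.cycle = target


rf = DelayedWriteRegisterFile(num_regs=4, depth=2)
rf.write(0, 10)   # cycle 0
rf.tick()
rf.write(0, 20)   # cycle 1, later found to be erroneous
rf.tick()
rf.rollback(1)    # delayed error signal: undo the write made in cycle 1
print(rf.read(0))
```

In this model the rollback cost is independent of the register file size, since only the small buffer is touched; that is the property the text attributes to the DWB approach.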

Similar resources

Optimal Diagnosis Procedures for k-out-of-n Structures

REFERENCES A. Mahmood and E. J. McCluskey, "Concurrent error detection using watchdog processors - A survey," IEEE Trans. Comput., vol. C-37, no. 2, pp. 160-174, Feb. 1988. M. Schuette and J. Shen, "Processor control flow monitoring using signatured instruction streams," IEEE Trans. Comput., vol. C-36, no. 3, pp. 264-276, Mar. 1987. N. R. Saxena and E. J. McCluskey, "Extended precision checksu...

A High-speed Watchdog Processor for Multitasking Systems

A new watchdog processor scheme for concurrent checking of program control flow is presented. This method is intended to check state-of-the-art processor architectures with on-chip caches as building blocks of multiprocessor systems. The signatures are assigned so that the processor instruction bus need not be monitored. The run-time and reference signatures are embedded into the checked prog...

Multiprocessor Checking Using Watchdog Processors

A new control flow checking scheme is presented, based on assigned-signature checking using a watchdog processor. This scheme is suitable for a multitasking, multiprocessor environment. The hardware overhead is comparatively low for three reasons: first, being hierarchically structured, the scheme uses only a single watchdog processor to monitor multiple processes on multiple processors. Secon...

Hierarchical Checking of Multiprocessors Using Watchdog Processors

A new control flow checking scheme, based on assigned-signature checking by a watchdog processor, is presented. This scheme is suitable for a multitasking, multiprocessor environment. The hardware overhead is comparatively low for three reasons: first, being hierarchically structured, the scheme uses only a single watchdog processor to monitor multiple processes or processors. Second, as an as...

Control-flow checking by software signatures

This paper presents a new signature monitoring technique, CFCSS (Control Flow Checking by Software Signatures); CFCSS is a pure software method that checks the control flow of a program using assigned signatures. An algorithm assigns a unique signature to each node in the program graph and adds instructions for error detection. Signatures are embedded in the program during compilation time usin...

Journal:
  • IEEE Trans. Computers

Volume 39, Issue

Pages -

Publication date: 1989